Recognizing Mandarin Chinese Fluent Speech Using Prosody Information—an Initial Investigation
Abstract
This paper demonstrates how prosody information can be used to recognize fluent Mandarin Chinese speech and what the recognized results imply. By applying our hierarchical prosody framework for fluent speech [1, 2], which specifies boundary breaks and boundary information across phrases and groups phrases into speech paragraphs, we developed software that automatically segments the speech flow at boundary breaks and labels the boundaries systematically. The recognized results are therefore identified speech paragraphs and the various levels of prosodic units within each paragraph. These prosodic units are not unrelated speech units but sister constituents that entail higher-level syntactic as well as semantic relationships, which cumulatively make up speech paragraphs in fluent continuous speech. This top-down approach differs from most bottom-up approaches: the former offers information from higher-level linguistic associations, whereas the latter treats identified Chinese syllables as discrete, unrelated units, or as lexical words at most, and leaves unaddressed the structural information that combines these syllables into linguistically significant units. We believe that top-down prosody information may well break new ground in fluent speech recognition.
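To make the idea of nested prosodic units concrete, the following is a minimal sketch and not the authors' software. It assumes, purely for illustration, that each syllable has already been labeled with an integer break level for the boundary that follows it (0 for no break, larger numbers for stronger breaks). The names `group_by_breaks`, `Token`, and `Tree`, as well as the numbering scheme, are hypothetical and are not taken from the cited framework.

```python
from typing import List, Tuple, Union

# Hypothetical input format: (syllable, break level that follows the syllable).
Token = Tuple[str, int]
Tree = Union[str, List["Tree"]]


def group_by_breaks(tokens: List[Token], level: int) -> Tree:
    """Split tokens at every break >= level, then recurse with level - 1.

    The strongest breaks delimit the outermost units, so the result is a
    nested list whose depth equals the starting level plus one.
    """
    if level == 0:
        # Innermost unit: just the syllables, with no further subdivision.
        return [syllable for syllable, _ in tokens]

    units: List[Tree] = []
    current: List[Token] = []
    for syllable, brk in tokens:
        current.append((syllable, brk))
        if brk >= level:
            # A boundary at least this strong closes the current unit.
            units.append(group_by_breaks(current, level - 1))
            current = []
    if current:
        # Trailing syllables without a closing break still form a unit.
        units.append(group_by_breaks(current, level - 1))
    return units


if __name__ == "__main__":
    labeled = [("wo", 0), ("men", 1), ("qu", 0), ("le", 2), ("hao", 0), ("ba", 2)]
    print(group_by_breaks(labeled, level=2))
```

In this sketch the syllables grouped under one parent are sister constituents of the same larger unit, which is the structural information a flat syllable sequence does not carry.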
Similar resources
Discourse prosody context - global F0 and tempo modulations
The present study is a corpus analysis of discourse prosodic information using two different types of fluent continuous Mandarin speech. Global F0 heights and duration patterns of within- and between-paragraph phrases were compared by discourse positions. Results showed that overall phrase-level F0 height was paragraph-initial>-medial>-final while the tempo pattern was paragraph-initial<-medial<-...
Fluent speech prosody: Framework and modeling
The prosody of fluent connected speech is much more complicated than concatenating individual sentence intonations into strings. A prosody framework and its modeling should be based on a better understanding of both the production and perception of fluent speech. We analyzed speech corpora of read Mandarin Chinese discourses from a top-down perspective on perceived units and boundaries, and consistently iden...
Collecting Mandarin Speech Databases for Prosody Investigations
The prosody of Mandarin running speech is notably marked by grouping of short phrases into perceptually identifiable larger units in the speech flow. An organization of Mandarin speech prosody should not only account for the grouping phenomenon, but also offer some explanation for such grouping in relation to information of other linguistic levels as well as speech planning. The physical, phone...
Research on dynamic characters of Chinese pitch contours
Chinese is a tone language. The characteristics of a tone's F0 contour in continuous speech differ considerably from those of the same tone spoken in isolation. Current research on Chinese tones still concentrates on tones spoken in isolation; regarding tones in fluent speech, there are some statements about the phenomenon of the two-word, three-word, four-word co-readin...
An efficient text analyzer with prosody generator-driven approach for Mandarin text-to-speech
A new approach to an efficient text analyser is proposed. A prosody generator-driven method is employed to design an efficient text analyser for Mandarin text-to-speech. This yields a simpler text-analysis structure, a more suitable classification of linguistic features, and a more efficient contribution of linguistic features to the prosody generator. Three heuristic and theoretical met...